List of AI News about SWE Bench
| Time | Details |
|---|---|
| 2026-02-27 12:10 | MiniMax M2.5 Beats Opus 4.6 on SWE-Bench Verified: 80.2% Score, 3x Faster, $1 per Hour - AI Coding Benchmark Analysis<br>According to God of Prompt on X (Twitter), MiniMax M2.5 surpassed Opus 4.6 on the SWE-Bench Verified benchmark with a score of 80.2%, runs roughly 3x faster, and is offered at a flat $1 per hour while using only 10B activated parameters, which the post says makes it the smallest Tier-1 model for coding tasks. As reported by the same source, these figures imply lower latency and significantly lower inference cost, making 24/7 autonomous coding agents and continuous integration bots feasible at practical budgets. According to the post, the combination of high benchmark accuracy and a small active parameter count suggests strong efficiency per dollar, which could improve ROI for software teams deploying code assistants, test repair bots, and maintenance agents in production pipelines. |
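
As a rough illustration of the "24/7 at practical budgets" claim, the sketch below works out the always-on cost implied by the quoted flat rate. Only the $1 per hour figure comes from the post; the 30-day month, the assumption that the rate applies per agent, and the helper name `always_on_cost` are illustrative and not details from the source.

```python
HOURS_PER_DAY = 24


def always_on_cost(rate_per_hour: float, days: int, agents: int = 1) -> float:
    """Cost of running `agents` agents around the clock for `days` days.

    Assumes the flat hourly rate is billed per agent (an assumption,
    not something the post states).
    """
    return rate_per_hour * HOURS_PER_DAY * days * agents


RATE_USD_PER_HOUR = 1.0  # flat rate quoted in the post

daily = always_on_cost(RATE_USD_PER_HOUR, days=1)
monthly = always_on_cost(RATE_USD_PER_HOUR, days=30)  # assumes a 30-day month

print(f"1 agent: ${daily:.2f}/day, ${monthly:.2f}/30 days")
# prints "1 agent: $24.00/day, $720.00/30 days"
```

Under these assumptions a single continuously running agent would cost about $720 per 30 days, and the total scales linearly with the number of agents.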